Connected: A Social Network Analysis Tutorial with NetworkX

Presenters: Rob Chew & Peter Baumgartner

Installation

$ git clone https://github.com/rtidatascience/connected-nx-tutorial.git
$ cd connected-nx-tutorial
$ conda env create -f environment.yml
$ source activate connected

Outline

  • Introduction & Background
  • Creating Graphs
  • Visualizing Graphs
  • Centrality
  • Link Prediction
  • Community

What is Social Network Analysis?

  • Examples
    • Zachary's Karate Club
    • Florentine Marriages
    • Semantic Text Network
  • Definitions

Examples

Zachary's Karate Club Network

The Iris dataset of social network analysis

Wayne W. Zachary studied the social network of a karate club for three years, from 1970 to 1972. The network captures 34 members of the club and documents 78 pairwise links between members who interacted outside the club. During the study, a conflict arose that split the club in two. Based on the collected data, Zachary correctly assigned all but one member to the group they actually joined after the split.

There is even a Zachary's Karate Club CLUB, which awards a trophy to the first person at a network conference to use Zachary's Karate Club Network as an example.
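The karate club network ships with NetworkX, so we can check the counts quoted above ourselves. The snippet below is a minimal sketch: it loads the built-in graph and groups members by the `club` node attribute, which in the bundled dataset records the faction each member joined after the split.

```python
import networkx as nx

# Load the karate club network bundled with NetworkX.
G = nx.karate_club_graph()

n_members = G.number_of_nodes()
n_links = G.number_of_edges()
print(n_members, "members,", n_links, "links")

# Group node ids by the "club" attribute (the faction joined after the split).
factions = {}
for node, club in G.nodes(data="club"):
    factions.setdefault(club, []).append(node)

for club, members in factions.items():
    print(club, len(members))
```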

15th Century Florentine Marriages

[Padgett and Ansell, 1993]

This graph is a marriage network of 16 influential Florentine families in the 1430s. At this time in Renaissance Italy, the major families were essentially an oligarchy, controlling politics and money in the region.

Based on this network, can you surmise which family ascended to power in the following decades?

By examining the right networks, we can understand which actors are the most central. In this case, the network forecasts the rise of the Medici, even though they were not the wealthiest or most politically connected family at the time.
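To make "most central" concrete, here is a short sketch using the Florentine families graph bundled with NetworkX. It ranks families by two common centrality measures, degree and betweenness; the choice of these two measures is ours, for illustration. Note that the bundled graph may contain fewer than 16 families, since, as far as we know, it leaves out a family with no marriage ties.

```python
import networkx as nx

# Marriage ties between Florentine families, as bundled with NetworkX.
F = nx.florentine_families_graph()

# Degree centrality: share of other families a family is married into.
degree = nx.degree_centrality(F)
# Betweenness centrality: how often a family lies on shortest paths between others.
betweenness = nx.betweenness_centrality(F)

top_degree = max(degree, key=degree.get)
top_betweenness = max(betweenness, key=betweenness.get)
print("Highest degree centrality:", top_degree)
print("Highest betweenness centrality:", top_betweenness)
```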

Semantic Text Network

A network of words in a document, connected and weighted by the frequency of appearance within 2-word and 5-word windows.

Paranyushkin, D. (2011). Identifying the pathways for meaning circulation using text network analysis. Berlin: Nodus Labs
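A word co-occurrence network like this can be sketched in a few lines. The example below is a simplified illustration under our own assumptions, not Paranyushkin's exact procedure: it uses a single window size, skips text cleaning, and the function name `cooccurrence_graph` and the sample sentence are hypothetical.

```python
import networkx as nx

def cooccurrence_graph(words, window=2):
    """Link each word to the words that follow it within a `window`-word span.

    The edge weight counts how often the two words co-occur.
    """
    G = nx.Graph()
    for i, word in enumerate(words):
        # A window of size w covers the current word plus the next w - 1 words.
        for other in words[i + 1 : i + window]:
            if other == word:
                continue  # ignore a word co-occurring with itself
            if G.has_edge(word, other):
                G[word][other]["weight"] += 1
            else:
                G.add_edge(word, other, weight=1)
    return G

text = "the cat sat on the mat and the cat slept"
T = cooccurrence_graph(text.split(), window=2)
print(T["the"]["cat"]["weight"])  # "the cat" appears twice in the sample text
```

Calling the same function with `window=5` yields a denser graph, since each word is also linked to words up to four positions ahead.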

Definitions

Network: a pattern of interconnections among a set of things [Source]

Social Network: a network where the things are people and the interconnections are social interactions

Social Network Analysis (SNA): the application of graph and network theory to investigate social structures.

Graph Theory: the study of graphs, which are mathematical structures used to model pairwise relations between objects.

Network Theory: the study of complex interacting systems that can be represented as graphs equipped with extra structure.

Parts of Graphs

Node / Vertex: The entity of analysis which has a relationship. Node is used in the network context, vertex is used in the graph theory context, but both terms are often used interchangeably.

Link / Edge / Relationship: The connections between the nodes. Link is used in the network context, edge is used in the graph theory context, and both are used interchangeably with relationship.

Attributes: Both nodes and edges can store attributes, which contain additional data about that object.

Weight: A common attribute of edges, used to indicate strength or value of a relationship.

Degree: Number of edges a node has.
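The terms above map directly onto NetworkX calls. The following is a small sketch on a made-up three-person network (the names, ages, and weights are invented for illustration) showing nodes, edges, attributes, weights, and degree.

```python
import networkx as nx

G = nx.Graph()

# Nodes, each with an "age" attribute.
G.add_node("Ann", age=34)
G.add_node("Bob", age=29)
G.add_node("Cat", age=41)

# Edges, each with a "weight" attribute for the strength of the relationship.
G.add_edge("Ann", "Bob", weight=3)
G.add_edge("Ann", "Cat", weight=1)

ann_age = G.nodes["Ann"]["age"]                    # node attribute
ann_bob_weight = G.edges["Ann", "Bob"]["weight"]   # edge attribute
ann_degree = G.degree("Ann")                       # number of edges Ann has
ann_strength = G.degree("Ann", weight="weight")    # sum of Ann's edge weights

print(ann_age, ann_bob_weight, ann_degree, ann_strength)
```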

Types of Graphs

Graphs are typically classified based on the presence of weights and direction attached to the edges in a graph. The table below covers what we call each type of graph:

Edge property     Absent        Present
Weights           Unweighted    Weighted
Directionality    Undirected    Directed

Additional flavors: parallel edges, self-loops, n-partite graphs

In context:

We are talking about a(n) [unweighted/weighted] [undirected/directed] graph (with [parallel edges | self loops]).
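In NetworkX, each of these choices corresponds to a graph class or a property we can query. The sketch below defines a hypothetical helper, `describe`, that fills in the template sentence for a given graph; the helper and the toy graphs are our own illustration, while `nx.is_weighted`, `is_directed`, `is_multigraph`, and `nx.number_of_selfloops` are existing NetworkX calls.

```python
import networkx as nx

def describe(G):
    """Fill in the template: [unweighted/weighted] [undirected/directed] graph."""
    weight = "weighted" if nx.is_weighted(G) else "unweighted"
    direction = "directed" if G.is_directed() else "undirected"
    extras = []
    # Only multigraph classes can hold more than one edge between two nodes.
    if G.is_multigraph() and any(G.number_of_edges(u, v) > 1 for u, v in G.edges()):
        extras.append("parallel edges")
    if nx.number_of_selfloops(G) > 0:
        extras.append("self-loops")
    suffix = f" with {' and '.join(extras)}" if extras else ""
    return f"{weight} {direction} graph{suffix}"

simple = nx.Graph([("a", "b")])           # undirected, no weights

flows = nx.DiGraph()                      # directed, weighted
flows.add_edge("a", "b", weight=2.5)

multi = nx.MultiGraph()                   # allows parallel edges
multi.add_edge("a", "b")
multi.add_edge("a", "b")                  # second edge between the same pair
multi.add_edge("c", "c")                  # self-loop

for graph in (simple, flows, multi):
    print(describe(graph))
```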
